ON THE STABILITY OF ROTATED FACTOR LOADINGS: THE WEXLER PHENOMENON



ABSTRACT 

The formulas which give the standard errors of factor loading estimates,
while available and computable, are complicated, and our understanding of
them is limited. A nontechnical description of their behavior under
favorable and unfavorable conditions is given. Of particular interest is
their behavior in the presence of singularities arising from equal
eigenvalues and undefined rotation.






1. INTRODUCTION

Numerous authors have looked at the sampling stability of loading
estimates which arise in factor analysis. While analytic results for the
standard errors of unrotated loadings have been available for some time
(Lawley, 1953, 1967; Lawley & Maxwell, 1971), those for analytically rotated
loadings are fairly recent (Archer & Jennrich, 1973; Jennrich, 1973a, b).
Consequently much of the early work on stability was based on simulation
studies. After reviewing several studies, Cliff & Hamburger (1966) found
that the standard errors of factor loading estimates were about the same
as those for correlations, that is, about $1/\sqrt{n}$ in magnitude for a
sample of size n. This is a crude but useful summary of some rather
complicated results. It is useful because it is simple and reasonably
accurate when everything is going right, and crude because it can be
fairly wide of the mark when this is not the case. Its usefulness may be
considerably enhanced by understanding the mechanisms which cause it
to be inaccurate. To identify some of these we begin with a result from
an interesting unpublished dissertation by Wexler (1968).

Wexler investigated the finite sample variances of maximum likelihood
factor loading estimates, comparing them with those obtained using the
asymptotic formulas of Lawley (1953). These were Lawley's early results
derived under the assumption of known unique variances, a restriction
which was later removed (Lawley, 1967). Let $A$ be the unrotated (i.e.,
canonical, Rao, 1955) factor loading matrix for a population satisfying
the usual assumptions in maximum likelihood factor analysis (Lawley &
Maxwell, 1971). Let $\Lambda$ be the varimax rotation of $A$ and let $T$
be the transformation which takes $A$ into $\Lambda$ so that

    $\Lambda = AT$ .    (1)

Wexler looked at two types of estimates for the rotated loadings $\Lambda$.
The first was the estimate

    $\Lambda^* = \hat{A}T$    (2)

computed from the maximum likelihood estimate $\hat{A}$ of $A$ using the
population value $T$. The second was the maximum likelihood estimate
$\hat{\Lambda}$ of $\Lambda$ computed without assuming $T$ is known so that

    $\hat{\Lambda} = \hat{A}\hat{T}$    (3)

where $\hat{T}$ is the matrix which takes $\hat{A}$ into its varimax
rotation $\hat{\Lambda}$. Lawley's formulas give the asymptotic standard
errors for the components of $\hat{A}$ and, by a simple transformation,
the asymptotic standard errors for the components of $\Lambda^*$. Because
$\hat{T}$ is a function of the data, the asymptotic standard errors for
the components of $\hat{\Lambda}$ are a little more complicated (Archer &
Jennrich, 1973) and were not available at the time of Wexler's study.

In general Wexler's simulation studies showed reasonably good agreement
with Lawley's formulas. Figure 1 is reproduced from his thesis. It
shows loading variances computed using Lawley's formulas plotted against
actual simulated variances for the components of $\Lambda^*$. There is
one point for each loading in $\Lambda$. The agreement is not perfect,
but taking into account the fact that only 100 simulations were used,
there are no statistically surprising departures from the asymptotic
results nor do there appear to be any systematic ones.



Insert Figure 1 about here 



Because the asymptotic variances for the $\hat{\Lambda}$ loadings were not
available at the time of Wexler's thesis, it had been suggested that the
asymptotic variances of the $\Lambda^*$ loadings be used as an
approximation for the variances of the $\hat{\Lambda}$ loadings. To test
this suggestion Wexler plotted the asymptotic variances of the $\Lambda^*$
loadings against actual simulated variances for the $\hat{\Lambda}$
loadings. His plot is given in Figure 2. Many who have seen this figure
find it somewhat surprising. First, the suggested approximation does not
seem to be satisfactory. But of greater interest to us is the fact that
for the most part the loadings computed using $\hat{T}$ are considerably
more stable than those using the true population value $T$. We would like
to understand why this is so.



Insert Figure 2 about here 



The results in Figure 2 were obtained using a population with a
very good varimax loading matrix, i.e., one with nice simple structure.
Wexler repeated his entire analysis using a population with only fair
simple structure. The results corresponding to Figure 2 are shown in
Figure 3. The phenomenon displayed in Figure 2 is far less pronounced
here. It is still abundantly clear, however, that it would be unwise
to use the asymptotic variances of the components of $\Lambda^*$ to
approximate those of $\hat{\Lambda}$.



Insert Figure 3 about here

These examples suggest what we shall call the Wexler phenomenon:

     When good simple structure exists rotated loadings may be
     surprisingly stable.

One manifestation of this phenomenon is that rotated loadings may be
considerably more stable than unrotated loadings. The opposite can also
happen and this suggests the anti-Wexler phenomenon:

     When good simple structure does not exist rotated loadings may
     be surprisingly unstable.

We intend to investigate these somewhat vague statements in greater detail.

2. FORMS OF DEGENERACY 

We believe that the Wexler phenomena are associated with forms of
degeneracy in the specification of a factor analysis model. Of particular
interest are those which arise from:
(i) equal eigenvalues
(ii) undefined rotation.
For the purpose of thinking about these we specialize to a simple case.
We shall in fact leave factor analysis entirely and consider the case of
principal components analysis with quartimax rotation. This simplification
allows us to understand the details clearly without sacrificing the
essential issues at hand.

It is easy to give an example which displays both forms (i) and (ii)
of degeneracy. Let the 3 by 2 matrix $A$ given in Figure 4 represent the
first two principal components of a 3 by 3 population covariance matrix.
In practice it is common to plot the rows of $A$. As displayed in Figure
4 they constitute three equally spaced points on the unit circle. The
first two eigenvalues here, being the column sums of squares for $A$, are
equal. Thus $A$ has the first form of degeneracy. On the other hand it
is easy to show that the quartimax criterion (Harman, 1967, p. 298) is
constant here over all orthogonal rotations of $A$. As a consequence,
the quartimax rotation of $A$ is undefined and $A$ also displays the
second form of degeneracy.
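
Both claims are easy to check numerically. The sketch below is only an illustration: the orientation of the three points (90, 210 and 330 degrees) is an assumption, since the matrix of Figure 4 is not reproduced in the text, and the quartimax criterion is taken to be the sum of fourth powers of the loadings. The helper names `quartimax` and `rotation` are hypothetical.

```python
import numpy as np

def quartimax(L):
    """Quartimax criterion, taken here as the sum of fourth powers of the loadings."""
    return float(np.sum(L ** 4))

def rotation(phi):
    """2 by 2 orthogonal rotation through the angle phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])

# Assumed stand-in for Figure 4: three equally spaced points on the unit circle.
angles = np.deg2rad([90.0, 210.0, 330.0])
A = np.column_stack([np.cos(angles), np.sin(angles)])

# Form (i): the column sums of squares (the two eigenvalues) coincide.
column_ss = np.sum(A ** 2, axis=0)

# Form (ii): the criterion takes the same value for every rotation of A.
values = [quartimax(A @ rotation(phi)) for phi in np.linspace(0.0, np.pi, 25)]

print("column sums of squares:", column_ss)
print("criterion range over rotations:", min(values), max(values))
```

Because the criterion has no unique maximizer over rotations, any rotation angle is as good as any other, which is what is meant by an undefined rotation.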



Insert Figure 4 about here

To understand their effect we shall look at examples showing these
forms of degeneracy separately. Let the 4 by 2 matrix $A$ given in
Figure 5 represent the first two principal components of a 4 by 4
population covariance matrix. The first two eigenvalues here are clearly
equal. On the other hand, as can be seen from Figure 5, $A$ has an
independent cluster structure so its quartimax rotation is well defined.
The matrix $A$ is in fact its own quartimax rotation. We want to consider
the statistical stability of an estimator $\hat{A}$ of $A$ obtained by
factoring a sample covariance matrix, and that of $\hat{\Lambda}$, the
quartimax rotation of $\hat{A}$.

When eigenvalues are equal, as they are here, we know from the results of
Anderson (1963) that as the sample size $n \to \infty$,

    $\hat{A} \to AX$    (4)

in distribution, where $X$ is a random orthogonal matrix. This means that
for large n, with high probability, a plot of the rows of $\hat{A}$ will
look like a random rotation of a slight perturbation of the points
displayed in Figure 5. Because of the random rotation

    $\hat{A} \not\to A$    (5)

as $n \to \infty$, so that $\hat{A}$ is not a consistent estimator of $A$.
On the other hand, quartimax rotation of $\hat{A}$ will undo the random
rotation, so that when n is large $\hat{\Lambda}$ will with high
probability look like a slight perturbation of $A$. In more precise terms,
(4) and the fact that $A$ is its own quartimax rotation imply that

    $\hat{\Lambda} \to A$    (6)

in probability as $n \to \infty$. Thus for large n the rotated loadings
$\hat{\Lambda}$, which converge, will have a greater stability than the
unrotated loadings $\hat{A}$, which do not, giving rise to the Wexler
phenomenon.
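
A small Monte Carlo experiment makes the argument concrete. The sketch below is an illustration under stated assumptions, not a reproduction of Figure 5: the independent cluster matrix, the noise variance 0.5, the sample size and the number of replications are all hypothetical choices. Two conventions are also imposed here and are not part of the text: unrotated columns are reflected to have positive sums, and rotated columns are matched to the population matrix by order and sign, since a rotation criterion determines loadings only up to column permutation and reflection.

```python
import itertools
import numpy as np

def quartimax_rotate(L):
    """Quartimax rotation of a two-column loading matrix via the closed-form angle."""
    u = L[:, 0] ** 2 - L[:, 1] ** 2
    v = 2.0 * L[:, 0] * L[:, 1]
    phi = 0.25 * np.arctan2(2.0 * np.sum(u * v), np.sum(u ** 2 - v ** 2))
    c, s = np.cos(phi), np.sin(phi)
    return L @ np.array([[c, -s], [s, c]])

def align(L, target):
    """Pick the column order and signs that bring L closest to target."""
    candidates = [L[:, list(p)] * np.array(sg)
                  for p in itertools.permutations(range(2))
                  for sg in itertools.product([1.0, -1.0], repeat=2)]
    return min(candidates, key=lambda M: np.sum((M - target) ** 2))

def principal_loadings(S, k=2):
    """First k principal component loadings: eigenvectors scaled by root eigenvalues."""
    vals, vecs = np.linalg.eigh(S)
    top = np.argsort(vals)[::-1][:k]
    L = vecs[:, top] * np.sqrt(vals[top])
    return L * np.where(L.sum(axis=0) < 0, -1.0, 1.0)  # reflect columns to positive sums

def simulate(n=200, reps=300, seed=0):
    rng = np.random.default_rng(seed)
    cluster = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
    sigma = cluster @ cluster.T + 0.5 * np.eye(4)  # eigenvalues 2.5, 2.5, 0.5, 0.5
    A = cluster * np.sqrt(1.25)                    # population loadings (hypothetical)
    chol = np.linalg.cholesky(sigma)
    unrotated, rotated = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, 4)) @ chol.T   # n draws from N(0, sigma)
        A_hat = principal_loadings(np.cov(X, rowvar=False))
        unrotated.append(A_hat)
        rotated.append(align(quartimax_rotate(A_hat), A))
    return np.std(unrotated, axis=0), np.std(rotated, axis=0)

sd_unrotated, sd_rotated = simulate()
print("mean SD of unrotated loadings:", sd_unrotated.mean())
print("mean SD of rotated loadings:  ", sd_rotated.mean())
```

With equal leading eigenvalues the unrotated estimates wander over all orientations from sample to sample, while the rotated estimates cluster near the population matrix, so the second printed figure should be much smaller than the first.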



Insert Figure 5 about here 



Consider next the example given in Figure 6. As before, let $A$
represent the first two principal components of a 3 by 3 population
covariance matrix. The eigenvalue ratio here is a comfortable 2.71. On
the other hand the constant c in $A$ has been carefully chosen so that the
quartimax criterion is constant over every orthogonal rotation of $A$,
making the quartimax rotation of $A$ undefined.

Insert Figure 6 about here


As before let $\hat{A}$ be an estimate of $A$ obtained by factoring a
sample covariance matrix and let $\hat{\Lambda}$ be the quartimax rotation
of $\hat{A}$. Because of the eigenvalue ratio (and assuming the third
eigenvalue is not equal or nearly equal to the second) we expect $\hat{A}$
to be near $A$ when n is large. Since quartimax rotation is undefined at
$A$, however, we expect small changes of $\hat{A}$ in a neighborhood of
$A$ to produce large changes in the rotated loadings $\hat{\Lambda}$. That
is, we expect the stability of $\hat{\Lambda}$ to be poor compared to that
of $\hat{A}$, giving rise to the anti-Wexler phenomenon. As we shall see
in the next section, this in fact happens.

The first form of degeneracy discussed here was considered by
Jöreskog (1963, p. 86) using a form of estimation proposed by him together
with target rotation. He observed that when the appropriate eigenvalues
are nearly equal his loading estimates could be expected to have large
variances before rotation but moderate variances after. This, in a slightly
different context, is a manifestation of the Wexler phenomenon. Jöreskog
did not consider cases corresponding to Figures 4 and 6, possibly because
they were not particularly interesting in the context of target rotation.

3. TWO CONFIRMATORY EXAMPLES

By considering the specific case of principal components analysis
and orthomax rotation we have set forth rationales for the Wexler and
anti-Wexler phenomena. In this section, by looking at two specific examples
and computing exact asymptotic variances, we will verify that the two forms
of degeneracy considered do in fact produce the Wexler and anti-Wexler
phenomena. To demonstrate the generality of the arguments given, we
choose examples which are technically quite different but clearly analogous
to those of the last section. They are based on standardized (i.e.,
computed from a sample correlation matrix) maximum likelihood loading
estimation and varimax rotation. Clearly maximum likelihood factor analysis
is not the same as principal components analysis, nor is varimax rotation
the same as quartimax.

Let $A$ be the matrix on the left in Table 1 and let it represent
the standardized loadings in a canonical (Rao, 1955) factor analysis model
for a normal population. The model involves 2 factors and 6 score
variables. This is the smallest number of variables a two factor model with
perfect structure can have and still have identifiable unique variances
(see the identifiability conditions summarized by Anderson & Rubin, 1956).
The appropriate eigenvalues here are the diagonal elements of
$A'\Psi^{-1}A$ where $\Psi = \mathrm{diag}(I - AA')$ is the matrix of
standardized unique variances. (Estimates of these eigenvalues are provided
by standard maximum likelihood factor analysis programs.) Because the
eigenvalue ratio here is nearly one (1.07 to two decimal places), analogy
with the example of Figure 5 suggests that the standard errors for the
maximum likelihood estimates of the components of $A$ will be quite large.
On the other hand the perfect structure of $A$ suggests that varimax
rotation may produce estimates with relatively small standard errors.
Using $A$ and the formulas of Jennrich (1973b), the asymptotic standard
errors of both the unrotated and rotated estimates were computed and are
recorded in Table 1. Clearly at least some of the unrotated loading
estimates are highly unstable while all of the rotated loading estimates
are quite stable. In the worst cases the standard errors of the unrotated
loadings are about 20 times as large as the corresponding rotated loadings.
It is easy to believe from this example that the Wexler phenomenon may be
made arbitrarily pronounced by choosing population values sufficiently
close to the appropriate form of degeneracy.



Insert Table 1 about here 




Turning to the anti-Wexler phenomenon, let $A$ be the 6 by 2 matrix
on the left in Table 2 and as before let it represent the standardized
loadings in a canonical factor analysis model for a normal population. The
eigenvalue ratio here is 4.25 so that, arguing by analogy with the last
example in the previous section, we expect the maximum likelihood estimates
of the unrotated loadings to have moderate standard errors. On the other hand
the value .81 in the upper left-hand corner of $A$ was carefully chosen so
that the varimax rotation of $A$ is nearly undefined (the precise value
which renders it undefined is $(.43)\,12^{1/4} = .8003$). Thus it is
reasonable to expect large standard errors for the maximum likelihood
estimates of the rotated loadings.

Using the formulas of Jennrich (1973b) again, the actual asymptotic
standard errors are given in Table 2. As expected the standard errors for
the unrotated loadings have moderate values while at least some of those
for the rotated loadings are quite large. In the worst cases the latter
are about 26 times as large as the former and it is easy to believe that
the anti-Wexler phenomenon may be made arbitrarily pronounced by choosing
an appropriate sufficiently singular example.
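
The eigenvalue ratios quoted for the two examples can be checked directly from the population loadings. The following sketch assumes the eigenvalues are the diagonal elements of $A'\Psi^{-1}A$ with $\Psi = \mathrm{diag}(I - AA')$, as stated for the first example. The function name `canonical_moment` is hypothetical, and the third row of the Table 2 loadings, which is only partly legible, is taken to be (-.43, .43) by assumption.

```python
import numpy as np

def canonical_moment(A):
    """Return A' Psi^{-1} A, with Psi = diag(I - A A') for standardized loadings."""
    psi = 1.0 - np.sum(A ** 2, axis=1)   # standardized unique variances
    return A.T @ (A / psi[:, None])

# Population loadings as read from Table 1 and Table 2.
table1 = np.array([[.81, .00]] * 3 + [[.00, .80]] * 3)
# Assumption: the third row of Table 2 is taken to be (-.43, .43).
table2 = np.array([[.81, .00]] * 2 + [[-.43, .43]] * 2 + [[-.43, -.43]] * 2)

for name, A in [("Table 1", table1), ("Table 2", table2)]:
    M = canonical_moment(A)
    print(name, "eigenvalue ratio:", M[0, 0] / M[1, 1],
          "off-diagonal element:", M[0, 1])
```

The off-diagonal element of $A'\Psi^{-1}A$ vanishes for both matrices, as the canonical form requires, and the two ratios come out near 1.07 and 4.25 respectively.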



Insert Table 2 about here 
4. DISCUSSION

Standard error formulas for analytically rotated factor loadings have
only recently become available. While we can now compute standard errors
at will, the computations which lead to them are complicated and our
understanding of them is quite limited. A crude but simple summary asserts
that for a factor loading estimate (of any kind) the

    standard error $\approx 1/\sqrt{n}$ .    (7)

Both unrotated and rotated loading estimates, however, can be made to
have arbitrarily large standard errors by choosing examples with the
appropriate form of singularity. Interestingly, rotated loadings need
not have large standard errors simply because the unrotated loadings from
which they are computed do. And conversely, very stable unrotated loadings
can lead to very unstable rotated loadings. We have called these
observations the Wexler and anti-Wexler phenomena and we know in some
detail why the summary (7) must be crude.

An approximation proposed by C. W. Harris and reported by Cattell
(1966, p. 235) asserts that for a factor loading estimate
$\hat{\lambda}_{ir}$ the

    standard error $\approx \left( \frac{(1 - h_i^2)\,\phi^{rr}}{n - k - 1} \right)^{1/2}$    (8)

where $h_i^2$ is the communality of the i-th variable, $\phi^{rr}$ is the
r-th diagonal element in the inverse of the matrix of factor correlations,
and k is the number of factors. Since there is nothing in this formula which
allows for the effect of nearly equal eigenvalues or poorly defined
rotation, we know from consideration of the Wexler phenomena that this
too must, in some cases at least, be a very crude approximation.
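
For a sense of scale, approximation (8) can be set beside summary (7). The sketch below assumes (8) has the regression-like form $[(1 - h_i^2)\phi^{rr}/(n - k - 1)]^{1/2}$, a reading of the formula from the symbols defined around it. The function name `harris_se` and the input values are hypothetical.

```python
import math

def harris_se(h2, phi_rr, n, k):
    """Harris-type approximation, assuming the form sqrt((1 - h2) * phi_rr / (n - k - 1)).

    h2     : communality of the variable
    phi_rr : r-th diagonal element of the inverse factor correlation matrix
    n      : sample size
    k      : number of factors
    """
    return math.sqrt((1.0 - h2) * phi_rr / (n - k - 1))

# Hypothetical inputs: communality .5, orthogonal factors (phi_rr = 1), n = 100, k = 2.
approx_8 = harris_se(h2=0.5, phi_rr=1.0, n=100, k=2)
approx_7 = 1.0 / math.sqrt(100)

print("approximation (8):", approx_8)
print("summary (7):      ", approx_7)
```

Neither quantity depends on how close the eigenvalues are or on how well the rotation is defined, which is why both remain crude near the degeneracies discussed above.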

On the other hand a summary as simple as (7) can be quite useful when
it is necessary to make simple inferences without the aid of a computer.
Lawley & Maxwell (1971) present a maximum likelihood factor analysis
involving 211 observations, 9 variables, and 3 factors. The eigenvalues
involved are quite distinct and direct quartimin rotation (Jennrich &
Sampson, 1966) gives loadings with good simple structure so that one might
be tempted to use

    $1/\sqrt{211} = .069$    (9)

as a standard error for the rotated loadings. The actual asymptotic
standard errors computed by Jennrich (1973a) are reproduced in Table 3.
Considering the simplicity of the formula which led to the value .069,
its agreement with the computed values is rather pleasing and good
enough for rough inferential purposes (cf. Jennrich, 1973a). Because
of the Wexler phenomena, however, one must use a good deal of caution
with such an approximation when eigenvalues are not clearly distinct or
rotations are not well defined.



Insert Table 3 about here 
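
The comparison between the crude value and Table 3 can be summarized in a few lines. This is a sketch only: the array below transcribes the Table 3 entries as they are legible in this copy, so individual digits should be treated with some caution.

```python
import numpy as np

n = 211
crude = 1.0 / np.sqrt(n)   # summary (7) applied to the Lawley & Maxwell example

# Table 3 entries as transcribed here (rows: variates 1-9, columns: factors I-III).
table3 = np.array([
    [.074, .082, .096],
    [.069, .082, .075],
    [.072, .055, .067],
    [.054, .084, .075],
    [.065, .069, .041],
    [.046, .058, .056],
    [.064, .056, .141],
    [.064, .081, .116],
    [.046, .050, .059],
])

print("crude value:", round(crude, 3))
print("Table 3 mean, min, max:", table3.mean(), table3.min(), table3.max())
```

The mean of the tabled values lies close to the crude value, while individual entries range from roughly half to roughly twice that figure.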
The author is indebted to Norman Wexler for making his unpublished
results available and for reviewing an earlier draft of this manuscript.
pointing out years ago the need for results on the statistical stability
of factor loading estimates.
Table 1. Asymptotic Results in the Case of Nearly Equal Eigenvalues
and Well Defined Rotation: Wexler Phenomenon

                             Asymptotic Standard Errors+
  Population Loadings*       Unrotated          Rotated

      .81    .00            .05    1.51        .05    .07
      .81    .00            .05    1.51        .05    .07
      .81    .00            .05    1.51        .05    .07
      .00    .80           1.59     .05        .07    .05
      .00    .80           1.59     .05        .07    .05
      .00    .80           1.59     .05        .07    .05

*The rotated and unrotated loadings are identical here.
+Standard errors are scaled to correspond to a sample size of 100.
Table 2. Asymptotic Results in the Case of Nearly Undefined Loadings
and a Good Eigenvalue Ratio: Anti-Wexler Phenomenon

                             Asymptotic Standard Errors+
  Population Loadings*       Unrotated          Rotated

      .81    .00            .06     .09        .06   2.33
      .81    .00            .06     .09        .06   2.33
     -.43    .43            .11     .12       1.27   1.31
     -.43    .43            .11     .12       1.27   1.31
     -.43   -.43            .11     .12       1.27   1.31
     -.43   -.43            .11     .12       1.27   1.31

*The rotated and unrotated loadings are identical here.
+Standard errors are scaled to correspond to a sample size of 100.
Table 3. Asymptotic Standard Errors for a Direct Quartimin Rotation of
Maximum Likelihood Loading Estimates: Reproduced from Jennrich (1973a)

                      Factor
  Variate       I       II      III

     1        .074    .082    .096
     2        .069    .082    .075
     3        .072    .055    .067
     4        .054    .084    .075
     5        .065    .069    .041
     6        .046    .058    .056
     7        .064    .056    .141
     8        .064    .081    .116
     9        .046    .050    .059
FIGURE CAPTIONS

FIG. 1. Scatter plot of asymptotic variances versus empirical
variances of loadings estimating the elements of a good simple structure
population factor matrix. The population transformation matrix was used.
Multiplicity of plots is indicated by encircled points.

FIG. 2. Scatter plot of asymptotic variances versus empirical
variances of loadings estimating the elements of a good simple structure
population factor matrix. The sample transformation matrix was used.
Multiplicity of plots is indicated by encircled points.

FIG. 3. Scatter plot of asymptotic variances versus empirical
variances of loadings estimating the elements of a population factor
matrix with only fair simple structure. The sample transformation
matrix was used. Multiplicity of plots is indicated by encircled points.

FIG. 4. A principal components example displaying both forms of
degeneracy: Equal eigenvalues and undefined rotation.

FIG. 5. A principal components example displaying the degeneracy
which leads to the Wexler phenomenon: Equal eigenvalues with well-defined
rotation.

FIG. 6. A principal components example displaying the degeneracy
which leads to the anti-Wexler phenomenon: Undefined rotation with
distinct eigenvalues.



[Figure residue. Axis label: "Asymptotic Factor Loading Variances". Figure 6: the matrix A, with entries involving c and -c, is not recoverable; the legible annotations read "c = 2" and "eigenvalue ratio = 2.71".]